Reinforcement Causal Structure Learning on Order Graph


Abstract

Learning a directed acyclic graph (DAG) that describes the causality of observed data is a very challenging but important task. Due to the limited quantity and quality of observed data, and the non-identifiability of the causal graph, it is almost impossible to infer a single precise DAG. Some methods approximate the posterior distribution of DAGs and explore the DAG space via Markov chain Monte Carlo (MCMC), but because the DAG space grows super-exponentially, accurately characterizing the whole distribution over DAGs is intractable. In this paper, we propose Reinforcement Causal Structure Learning on Order Graph (RCL-OG), which uses an order graph instead of MCMC to model different DAG topological orderings and to reduce the problem size. RCL-OG first defines reinforcement learning with a new reward mechanism to approximate the posterior distribution of orderings efficiently, and uses deep Q-learning to update and transfer rewards between nodes. Next, it obtains the probability transition model of nodes on the order graph and computes the posterior probability of different orderings. In this way, we can sample from this model to obtain orderings with high probability. Experiments on synthetic and benchmark datasets show that RCL-OG provides an accurate posterior probability approximation and achieves better results than competitive causal discovery algorithms.
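The abstract describes a transition model over an order graph from which orderings are scored and sampled. The following is only a minimal sketch of that idea, not the authors' implementation: it assumes a state is the set of variables already placed in the ordering, that transition probabilities come from a softmax over Q-values, and that an ordering's probability is the product of its transition probabilities. The function `toy_q` is a hypothetical stand-in for a learned deep Q-network, and all names here are illustrative.

```python
import itertools
import math
import random


def transition_probs(state, nodes, q_value):
    """Softmax over the Q-values of the variables not yet in the ordering.

    Assumption: turning Q-values into probabilities via softmax is an
    illustrative choice, not necessarily what RCL-OG does.
    """
    remaining = [v for v in nodes if v not in state]
    scores = [q_value(state, v) for v in remaining]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return {v: w / total for v, w in zip(remaining, weights)}


def ordering_probability(ordering, nodes, q_value):
    """Probability of one ordering: product of transitions along its path."""
    state = frozenset()
    prob = 1.0
    for v in ordering:
        prob *= transition_probs(state, nodes, q_value)[v]
        state = state | {v}
    return prob


def sample_ordering(nodes, q_value, rng):
    """Walk the order graph from the empty set, sampling one variable per step."""
    state = frozenset()
    ordering = []
    while len(ordering) < len(nodes):
        probs = transition_probs(state, nodes, q_value)
        candidates = list(probs)
        chosen = rng.choices(candidates, weights=[probs[c] for c in candidates])[0]
        ordering.append(chosen)
        state = state | {chosen}
    return ordering


def toy_q(state, v):
    """Hypothetical Q-function; a real system would use a trained network."""
    return {"A": 2.0, "B": 1.0, "C": 0.0}[v]


NODES = ["A", "B", "C"]
print(sample_ordering(NODES, toy_q, random.Random(0)))
```

Because each step normalizes over the remaining variables, the probabilities of all orderings sum to one, so the model defines a proper distribution over topological orderings under these assumptions.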


Similar resources

Order-independent constraint-based causal structure learning

We consider constraint-based methods for causal structure learning, such as the PC-, FCI-, RFCI- and CCD-algorithms (Spirtes et al. (2000, 1993), Richardson (1996), Colombo et al. (2012), Claassen et al. (2013)). The first step of all these algorithms consists of the PC-algorithm. This algorithm is known to be order-dependent, in the sense that the output can depend on the order in which the variab...


Reinforcement learning and causal models

This chapter reviews the diverse roles that causal knowledge plays in reinforcement learning. The first half of the chapter contrasts a “model-free” system that learns to repeat actions that lead to reward with a “model-based” system that learns a probabilistic causal model of the environment which it then uses to plan action sequences. Evidence suggests that these two systems coexist in the br...


Learning Higher-Order Graph Structure with Features by Structure Penalty

In discrete undirected graphical models, the conditional independence of node labels Y is specified by the graph structure. We study the case where there is another input random vector X (e.g. observed features) such that the distribution P(Y | X) is determined by functions of X that characterize the (higher-order) interactions among the Y's. The main contribution of this paper is to learn th...


Abolishing the effect of reinforcement delay on human causal learning.

Associative learning theory postulates two main determinants for human causal learning: contingency and contiguity. In line with such an account, participants in Shanks, Pearson, and Dickinson (1989) failed to discover causal relations involving delays of more than two seconds. More recent research has shown that the impact of contiguity and delay is mediated by prior knowledge about the timefr...


Partial Order Hierarchical Reinforcement Learning

In this paper, the notion of a partial-order plan is extended to task-hierarchies. We introduce the concept of a partial-order task-hierarchy that decomposes a problem using multi-tasking actions. We go further and show how a problem can be automatically decomposed into a partial-order task-hierarchy, and solved using hierarchical reinforcement learning. The problem structure determines the reduc...



Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2023

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v37i9.26274